RAID-CUBE: The Modern Datacenter Case for RAID
Authors
Abstract
“Big Data” processing in modern datacenters dramatically increases the data volume moving between applications and storage. A major challenge is achieving acceptable levels of availability and reliability in an environment characterized by huge storage capacities, large numbers of disk drives, and very high interconnection bandwidth (e.g., 100 petabytes and 17,000 disk drives at CERN). In this paper, we show that existing RAID mechanisms are insufficient, and that the mean time to data loss (MTTDL) falls drastically as the number of disks and the data volume increase. We introduce a new high-availability storage configuration, which we call RAID-CUBE, and show that it is more resilient to data loss as the datacenter scales in capacity than existing RAID dual-parity and triple-parity schemes. We also identify the limits on datacenter capacity (in terms of the number of disks) within which an acceptable MTTDL can be maintained for different data protection mechanisms. Finally, we briefly introduce an effective mechanism for bit error protection for large sequential I/Os in this environment.
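To make the scaling claim concrete, the following is a minimal sketch of the classic textbook MTTDL approximation for a parity-protected group, which assumes independent, exponentially distributed disk failures. It is not the model used in the paper, and it does not describe RAID-CUBE. The function names and the MTTF and MTTR figures are illustrative assumptions.

```python
def mttdl_hours(n_disks: int, parity: int, mttf_h: float, mttr_h: float) -> float:
    """Approximate MTTDL of one RAID group (textbook model, not the paper's).

    MTTDL ~= MTTF^(p+1) / (N * (N-1) * ... * (N-p) * MTTR^p)
    where p is the number of parity disks the group can lose safely.
    """
    if n_disks <= parity:
        raise ValueError("a group needs more disks than parity disks")
    denominator = mttr_h ** parity
    for k in range(parity + 1):
        denominator *= n_disks - k
    return mttf_h ** (parity + 1) / denominator


def system_mttdl_hours(total_disks: int, group_size: int, parity: int,
                       mttf_h: float, mttr_h: float) -> float:
    """MTTDL of a datacenter built from independent, identical groups.

    Assumption: data is lost as soon as any one group loses data, so the
    system MTTDL is the group MTTDL divided by the number of groups.
    """
    groups = total_disks // group_size
    if groups < 1:
        raise ValueError("total_disks must cover at least one group")
    return mttdl_hours(group_size, parity, mttf_h, mttr_h) / groups


if __name__ == "__main__":
    MTTF_H = 1_000_000.0  # assumed per-disk mean time to failure, hours
    MTTR_H = 24.0         # assumed rebuild time, hours
    HOURS_PER_YEAR = 8760.0
    for total in (100, 1_000, 17_000):
        for parity in (1, 2, 3):
            years = system_mttdl_hours(total, 10, parity, MTTF_H, MTTR_H) / HOURS_PER_YEAR
            print(f"disks={total:>6} parity={parity} MTTDL ~ {years:.3e} years")
```

Under these assumptions the system MTTDL shrinks in proportion to the number of groups, and each extra parity disk buys roughly a factor of MTTF/MTTR. The model ignores unrecoverable bit errors and correlated failures, both of which lower real-world MTTDL further.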
Related resources
Zoned-RAID for Multimedia Database Servers
This paper proposes a novel fault-tolerant disk subsystem named Zoned-RAID (Z-RAID). Z-RAID improves the performance of traditional RAID systems by utilizing the zoning property of modern disks, which provides multiple zones with different data transfer rates within a disk. This study proposes to optimize the data transfer rate of a RAID system by constraining the placement of data blocks in multi-zone disks. ...
Simulation and Modelling of RAID 0 System Performance
RAID systems are fundamental components of modern storage infrastructures. It is therefore important to model their performance effectively. This paper describes a simulation model which predicts the cumulative distribution function of I/O request response time in a RAID 0 system consisting of homogeneous zoned disk drives. The model is constructed in a bottom-up manner, starting by abstracting...
Ultimate Codes: Near-Optimal MDS Array Codes for RAID-6
As modern storage systems have grown in size and complexity, RAID-6 is poised to replace RAID-5 as the dominant form of RAID architectures due to its ability to protect against double disk failures. Many excellent erasure codes specially designed for RAID-6 have emerged in recent years. However, all of them have limitations. In this paper, we present a class of near perfect erasure codes for RA...
S2-RAID: Parallel RAID Architecture for Fast Data Recovery
As disk volume grows rapidly, with terabyte disks becoming the norm, the RAID reconstruction process after a failure takes a prohibitively long time. This paper presents a new RAID architecture, S2-RAID, allowing the disk array to reconstruct very quickly in case of a disk failure. The idea is to form skewed sub-arrays in the RAID structure so that reconstruction can be done in parallel dramatically...
RAIDq: A Software-friendly, Multiple-parity RAID
As disk manufacturers compete to build ever larger and cheaper disks, the possibility of RAID failures becomes more significant for larger and larger disk arrays, creating opportunities for products beyond RAID 6. In this paper, we present the design and implementation of RAIDq, a software-friendly, multiple-parity RAID. RAIDq uses a linear code with efficient encoding and decoding algorithms a...